Identifying community structure in networks is an issue of particular interest in network science. 
The modularity introduced by Newman and Girvan [Phys. Rev. E 69, 026113 (2004)] is the most 
popular quality function for community detection in networks. In this study, we identify a problem 
in the concept of modularity and suggest a solution to overcome this problem. Specifically, we obtain 
a new quality function for community detection. We refer to the function as Z-modularity because 
it measures the Z-score of a given division with respect to the fraction of the number of edges within 
communities. Our theoretical analysis shows that Z-modularity mitigates the resolution limit of 
the original modularity in certain cases. Computational experiments using both artificial networks 
and well-known real-world networks demonstrate the validity and reliability of the proposed quality 
function. 
I. INTRODUCTION 

Many complex systems can be represented as networks. Analyzing the structure and dynamics of these networks provides meaningful information about the underlying systems. In fact, complex networks have attracted significant attention from diverse fields such as physics, informatics, chemistry, biology, and sociology [1, 2].

An issue of particular interest in network science is the identification of community structure [3]. Roughly speaking, a community (also referred to as a module) is a subset of vertices more densely connected with each other than with vertices in the rest of the network. Note that no absolute definition of a community exists because any such definition typically depends on the specific system at hand. Detecting communities is a powerful way to discover components that have some special roles or possess important functions. For example, consider the network representing the World Wide Web, where vertices correspond to web pages and edges represent the hyperlinks between pages. Communities in this network are likely to be the sets of web pages dealing with the same or similar topics.

There are various methods to detect community structure in networks, which can be roughly divided into two types. First, there are methods based on some conditions that should be satisfied by a community. The most fundamental concept is a clique. A clique is a subset of vertices wherein every pair of vertices is connected by an edge. As even a singleton or an edge is a clique, we are usually interested in finding a maximum clique or a maximal clique, i.e., a clique of maximum size or a clique not contained in any other clique, respectively. Although the definition of a clique is very intuitive, it is too strong and restrictive to use practically. In 2004, Radicchi et al. [4] introduced more practical definitions: a community in a strong sense and a community in a weak sense. A subset S of vertices is called a community in a strong sense if for every vertex in S, the number of neighbors in S is strictly greater than the number of neighbors outside S. On the other hand, a subset S of vertices is called a community in a weak sense if the sum, over all vertices in S, of the number of neighbors in S is strictly greater than the number of cut edges of S. Thus, if a subset of vertices is a community in a strong sense, then it is also a community in a weak sense. Recently, Cafieri et al. [5] proposed an enumerative algorithm to list all divisions of the set of vertices into communities in a strong sense with moderate sizes.
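The three definitions above can be checked mechanically. The following is a minimal sketch, assuming the network is stored as a dictionary `adj` that maps each vertex to the set of its neighbors; the function names and the small example network (two triangles joined by one edge) are illustrative and do not come from the cited works.

```python
from itertools import combinations


def is_clique(adj, subset):
    """True if every pair of distinct vertices in subset is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(subset, 2))


def internal_external(adj, subset, v):
    """Number of neighbors of v inside subset and outside subset."""
    inside = sum(1 for u in adj[v] if u in subset)
    return inside, len(adj[v]) - inside


def is_strong_community(adj, subset):
    """Every vertex has strictly more neighbors inside subset than outside."""
    subset = set(subset)
    return all(k_in > k_out
               for k_in, k_out in (internal_external(adj, subset, v) for v in subset))


def is_weak_community(adj, subset):
    """Summed internal degree strictly exceeds the number of cut edges."""
    subset = set(subset)
    pairs = [internal_external(adj, subset, v) for v in subset]
    return sum(k_in for k_in, _ in pairs) > sum(k_out for _, k_out in pairs)


# Illustrative network: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
```

In this example the triangle {0, 1, 2} satisfies all three definitions, whereas {0, 1, 2, 3} is a community only in the weak sense because vertex 3 has more neighbors outside the subset than inside.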

Second, but perhaps more importantly, there are methods that maximize a globally defined quality function. The best known and most commonly used quality function is modularity, which was introduced by Newman and Girvan [6]. Here let $G = (V, E)$ be an undirected network consisting of $n = |V|$ vertices and $m = |E|$ edges. The modularity, a quality function for a division $\mathcal{C} = \{C_1, \dots, C_k\}$ of $V$ (i.e., $\bigcup_{i=1}^{k} C_i = V$ and $C_i \cap C_j = \emptyset$ for $i \neq j$), can be written as

$$Q(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left( \frac{m_C}{m} - \left( \frac{D_C}{2m} \right)^2 \right),$$

where $m_C$ is the number of edges in community $C$, and $D_C$ is the sum of the degrees of the vertices in community $C$. The modularity represents the sum, over all communities, of the fraction of the number of edges in the communities minus the expected fraction of such edges assuming that they are placed at random with the same distribution of vertex degree.
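As an illustration of this definition, the sketch below evaluates the modularity of a division directly from an edge list. It assumes a simple undirected network given as a list of vertex pairs and a mapping `community_of` from vertex to community label; both names are illustrative.

```python
def modularity(edges, community_of):
    """Q = sum over communities of m_C / m - (D_C / (2m))^2."""
    m = len(edges)
    intra = {}       # m_C: number of edges inside each community
    degree_sum = {}  # D_C: sum of vertex degrees in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Illustrative network: two triangles joined by a single edge (m = 7).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
TWO_TRIANGLES = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_BLOCK = {v: "a" for v in range(6)}
```

For the division into the two triangles, each community has $m_C = 3$ and $D_C = 7$, so $Q = 2 (3/7 - 1/4) = 5/14$; placing all vertices in one community gives $Q = 0$.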

Many studies have examined modularity maximization. In 2008, Brandes et al. [7] proved that modularity maximization is NP-hard. This implies that, unless P = NP, no modularity maximization method simultaneously (i) finds a division that maximizes modularity exactly, (ii) runs in time polynomial in n and m, and (iii) does so for any network. To date, a major focus in modularity maximization has been designing accurate and scalable heuristics. In fact, there are a wide variety of algorithms based on greedy techniques [8, 9], simulated annealing [10-12], extremal optimization [13], spectral optimization [14, 15], mathematical programming [16-19], and other techniques. Note that to reduce computation time, a few pre-processing techniques have been proposed [20]. Moreover, to improve the quality of divisions obtained by such heuristics, some post-processing algorithms have also been developed [21].

Although modularity maximization is the most popular and widely used method in practice, it is also known to have some serious drawbacks, i.e., the resolution limit [22] and degeneracies [23]. The former means that modularity maximization fails to detect communities smaller than a certain scale depending on the total number of edges in a network even if the communities are cliques connected by single edges. The latter means that there exist numerous nearly optimal divisions in terms of modularity maximization, which makes finding communities with maximum modularity extremely difficult. The resolution limit particularly narrows the application range of modularity maximization because most real-world networks consist of communities with very different sizes. To avoid this issue, some multiresolution variants of the modularity have been adopted in practical applications [24-26]. In these variants, the resolution level can be tuned freely by adjusting certain parameters. However, once the resolution level is determined, communities larger than the determined resolution level tend to be divided and smaller communities tend to be merged. Therefore, such multiresolution variants also fail to detect real community structure [27].

In this study, we identify a problem in the concept of modularity and suggest a solution to overcome this problem. Specifically, we obtain a new quality function for community detection. We refer to this function as Z-modularity because it measures the Z-score of a given division with respect to the fraction of the number of edges within communities. Our theoretical analysis shows that Z-modularity mitigates the resolution limit of the original modularity in certain cases. In fact, Z-modularity never merges adjacent cliques in the well-known ring of cliques network with any number and size of cliques. Computational experiments using both artificial networks and well-known real-world networks demonstrate the validity and reliability of the proposed quality function.

Note that there are many quality functions based on modularity or other concepts [28-33]. Most of them are collected in Ref. [3].

This paper is structured as follows. In Sec. II, our quality function Z-modularity is introduced. In Sec. III, a theoretical analysis of the properties of Z-modularity is described. The results of computational experiments are shown in Sec. IV. Finally, conclusions and suggestions for future work are given in Sec. V.



FIG. 1: (Color online) Probability distributions. 


II. DEFINITION OF Z-MODULARITY 

Modularity simply computes the fraction of the number of edges within communities minus its expected value. The definition is quite intuitive; thus, it is the most popular and widely used quality function in practice.

However, we identify a problem with the concept of modularity. Consider two divisions $\mathcal{C}_1$ and $\mathcal{C}_2$. Assume that the fractions of the number of edges within communities of $\mathcal{C}_1$ and $\mathcal{C}_2$ are 0.2 and 0.6, respectively. In addition, assume that their expected values are 0.1 and 0.5, respectively. Then, we see that these two divisions share the same modularity value (i.e., $Q(\mathcal{C}_1) = Q(\mathcal{C}_2) = 0.1$). The key question is as follows: should these two divisions receive the same quality value? Our answer is that it must depend on the variance of the probability distribution of the fraction of the number of edges within communities of $\mathcal{C}_1$ and $\mathcal{C}_2$. Fig. 1 illustrates an example. In this case, we wish to assign a higher quality value to $\mathcal{C}_1$ because it is statistically much rarer than $\mathcal{C}_2$. This simple but critical observation forms the basis of our quality function.
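The two-division example can be made concrete with a few lines of arithmetic. The sketch below assumes that each edge independently falls within communities with probability equal to the expected fraction, so that the observed fraction over N edges has variance expected * (1 - expected) / N; the sample size N = 100 is an arbitrary illustrative choice and is not a value used in the paper.

```python
from math import sqrt


def z_score(observed, expected, n_edges):
    """Standardized excess of the observed within-community fraction."""
    std = sqrt(expected * (1.0 - expected) / n_edges)
    return (observed - expected) / std


# Both divisions have the same excess of 0.1 over the expected value.
z1 = z_score(0.2, 0.1, n_edges=100)  # standard deviation 0.03
z2 = z_score(0.6, 0.5, n_edges=100)  # standard deviation 0.05
```

Under this assumption the first division lies about 3.3 standard deviations above its expectation and the second lies 2 standard deviations above, so the first is the rarer outcome even though both have modularity 0.1. The ratio of the two scores does not depend on the choice of N.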

Given an undirected network $G = (V, E)$ consisting of $n = |V|$ vertices and $m = |E|$ edges, and a division $\mathcal{C}$ of $V$, we aim to quantify the statistical rarity of division $\mathcal{C}$ in terms of the fraction of the number of edges within communities. To this end, we consider the following edge generation process over $V$. Place $N$ edges over $V$ at random with the same distribution of vertex degree. Then, when we place an edge, the probability that the edge is placed within communities is given by

$$p = \sum_{C \in \mathcal{C}} \left( \frac{D_C}{2m} \right)^2.$$

Note that this edge generation process is the same as the null-model (also known as the configuration model [34]) used in the definition of modularity, with the exception of the sample size. We simply wish to estimate the probability distribution of the fraction of the number of edges within communities. Thus, unlike the null-model, the sample size $N$ is not necessarily equal to the number of edges $m$.
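The edge generation process can be simulated to check the expression for p. The sketch below makes one modeling assumption explicit: each endpoint of a placed edge is drawn independently with probability proportional to vertex degree. The function names, the example degrees (two triangles joined by one edge), and the sample size are illustrative.

```python
import random


def within_probability(degree, community_of):
    """p = sum over communities of (D_C / (2m))^2."""
    total = sum(degree.values())  # equals 2m
    degree_sum = {}
    for v, d in degree.items():
        c = community_of[v]
        degree_sum[c] = degree_sum.get(c, 0) + d
    return sum((d / total) ** 2 for d in degree_sum.values())


def simulated_fraction(degree, community_of, n_samples, rng):
    """Fraction of randomly placed edges whose endpoints share a community."""
    vertices = list(degree)
    weights = [degree[v] for v in vertices]
    # Draw both endpoints of every edge in proportion to vertex degree.
    ends = rng.choices(vertices, weights=weights, k=2 * n_samples)
    hits = sum(community_of[ends[2 * i]] == community_of[ends[2 * i + 1]]
               for i in range(n_samples))
    return hits / n_samples


DEGREE = {0: 2, 1: 2, 2: 3, 3: 3, 4: 2, 5: 2}
COMMUNITY = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
P_EXACT = within_probability(DEGREE, COMMUNITY)
P_SIMULATED = simulated_fraction(DEGREE, COMMUNITY, 20000, random.Random(0))
```

Here both communities have degree sum 7 out of 14, so p = 0.5, and the simulated fraction should fluctuate around this value with standard deviation sqrt(p(1 - p)/N).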
Let $X$ be a random variable denoting the number of edges generated by the process within communities. Then, $X$ follows the binomial distribution $B(N, p)$. By the central limit theorem, when the sample size $N$ is sufficiently large, the distribution of $X/N$ can be approximated by the normal distribution $\mathcal{N}(p, p(1 - p)/N)$. Thus, we can quantify the statistical rarity of division $\mathcal{C}$ in terms of the fraction of the number of edges within communities using the Z-score as follows:

$$Z(\mathcal{C}) = \frac{\sum_{C \in \mathcal{C}} \frac{m_C}{m} - \sum_{C \in \mathcal{C}} \left( \frac{D_C}{2m} \right)^2}{\sqrt{\sum_{C \in \mathcal{C}} \left( \frac{D_C}{2m} \right)^2 \left( 1 - \sum_{C \in \mathcal{C}} \left( \frac{D_C}{2m} \right)^2 \right)}}.$$

The sample size $N$ never depends on a given division; thus, it is omitted in the denominator. We refer to this quality function as Z-modularity.
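A direct evaluation of this quality function might look as follows. This is a minimal sketch for a simple undirected network given as an edge list; the function name, the error raised for the undefined case, and the example network are illustrative choices.

```python
from math import sqrt


def z_modularity(edges, community_of):
    """Z-score of the within-community edge fraction (N omitted)."""
    m = len(edges)
    intra = 0
    degree_sum = {}  # D_C for each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        intra += cu == cv
    p = sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    if p >= 1.0:
        # One community holds every edge, so the variance p(1 - p) is zero.
        raise ValueError("Z-modularity is undefined for this division")
    return (intra / m - p) / sqrt(p * (1.0 - p))


# Illustrative network: two triangles joined by a single edge.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
TWO_TRIANGLES = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
```

For the division into the two triangles, the within-community fraction is 6/7 and p = 1/2, which gives Z = (6/7 - 1/2) / (1/2) = 5/7.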


III. THEORETICAL ANALYSIS 

Fortunato and Barthélemy [22] pointed out the resolution limit of modularity. This resolution limit means that modularity maximization fails to detect communities that are smaller than a certain scale depending on the total number of edges in a network. This phenomenon occurs even if the communities are cliques connected by single edges. Here we theoretically analyze Z-modularity from a resolution limit perspective. As a result, we demonstrate that Z-modularity mitigates the resolution limit of the original modularity in certain cases.
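The resolution limit can be seen numerically on a ring of q cliques with p vertices each, joined in a cycle by single edges, which is the network analyzed in the next subsection. The sketch below uses closed forms that follow from the definitions of Q and Z when every community is a run of s consecutive cliques; the function name and the restriction that s divides q with at least two communities are assumptions made for simplicity.

```python
from math import sqrt


def ring_of_cliques_scores(p, q, s):
    """Return (Q, Z) when each community is a run of s consecutive cliques."""
    if q % s != 0 or q // s < 2:
        raise ValueError("s must divide q and leave at least two communities")
    m = q * (1 + p * (p - 1) // 2)  # clique edges plus one ring edge per clique
    l = q // s                      # number of communities
    intra_fraction = 1 - l / m      # exactly l ring edges are cut
    expected = 1 / l                # l communities with equal degree sums
    modularity = intra_fraction - expected
    z = modularity / sqrt(expected * (1 - expected))
    return modularity, z


# Cliques kept separate (s = 1) versus merged in pairs (s = 2), for p = 5.
Q_SINGLE_40, Z_SINGLE_40 = ring_of_cliques_scores(5, 40, 1)
Q_PAIRED_40, Z_PAIRED_40 = ring_of_cliques_scores(5, 40, 2)
```

With p = 5 and q = 40, merging cliques in pairs raises modularity but lowers Z-modularity, whereas with q = 20 both functions prefer the unmerged division.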


A. Ring of cliques network

First, we consider a ring of cliques network that consists of a number of cliques connected by single edges (Fig. 2). Assume that each clique consists of $p$ ($\geq 3$) vertices and the number of cliques is $q$ ($\geq 2$). Then, the network has $n = p \cdot q$ vertices and $m = q \cdot (1 + p(p - 1)/2)$ edges. Fortunato and Barthélemy [22] showed that modularity maximization would merge adjacent cliques if $q$ is larger than a certain value depending on $p$. However, adjacent cliques are never merged in a division with maximal Z-modularity value, as shown below.

FIG. 2: (Color online) Ring of cliques network. $K_p$ represents a clique with $p$ vertices.

Let $\mathcal{C}^*$ be the division of $V$ into the cliques. In addition, let $\mathcal{C} = \{C_1, \dots, C_l\}$ ($1 < l < q$) be a division of $V$ such that each $C_i$ consists of a series of $s_i$ ($\geq 1$) cliques and $q = \sum_{i=1}^{l} s_i$. Then, Z-modularity for $\mathcal{C}^*$ and $\mathcal{C}$ are calculated by

$$Z(\mathcal{C}^*) = \frac{1 - q/m - 1/q}{\sqrt{(1/q)(1 - 1/q)}} \quad \text{and} \quad Z(\mathcal{C}) = \frac{1 - l/m - t}{\sqrt{t(1 - t)}},$$

respectively, where $t = \sum_{i=1}^{l} (s_i/q)^2$. By the Cauchy-Schwarz inequality, we have $1 > t = \sum_{i=1}^{l} (s_i/q)^2 \geq \left( \sum_{i=1}^{l} s_i/q \right)^2 / l = 1/l$. Here define

$$f(x, y) = \frac{1 - y/m - x}{\sqrt{x(1 - x)}}.$$

Then, the derivative of $f(x, y)$ with respect to $x$ is

$$\frac{\partial}{\partial x} f(x, y) = \frac{-x \cdot y/m - (1 - y/m)(1 - x)}{2 \cdot (x(1 - x))^{3/2}} < 0$$

for $0 < x < 1$ and $1 < y < m$. Thus, we obtain

$$f(1/l, l) \geq f(t, l).$$

Moreover, the derivative of $f(1/y, y)$ with respect to $y$ is

$$\frac{d}{dy} f(1/y, y) = \frac{(m - 3y)(y - 1) + y}{2m \cdot (y - 1)^{3/2}} > 0$$

for $1 < y < m/3$. Thus, we have

$$f(1/q, q) > f(1/l, l),$$

since $1 < l < q \leq m/4$ by $m = q \cdot (1 + p(p - 1)/2) \geq 4q$. Therefore, we have

$$Z(\mathcal{C}^*) = f(1/q, q) > f(1/l, l) \geq f(t, l) = Z(\mathcal{C}),$$

which means that maximizing Z-modularity never merges adjacent cliques.

Table I lists the values of modularity and Z-modularity of divisions $\mathcal{C}^*$ and $\mathcal{C}$ ($s_i = 2$ for $i = 1, \dots, l$) for some ring of cliques networks. As can be seen, the modularity of $\mathcal{C}$ is greater than that of $\mathcal{C}^*$ when the number of cliques is large, which is consistent with Fortunato and Barthélemy [22]. On the other hand, as we proved above, Z-modularity of $\mathcal{C}^*$ is certainly higher than that of $\mathcal{C}$ for every number of cliques.

TABLE I: Numerical examples of modularity and Z-modularity for some ring of cliques networks.

n      m       p   q      Q(C*)    Q(C)     Z(C*)   Z(C)
100    220     5   20     0.8591   0.8548   3.942   2.848
200    440     5   40     0.8841   0.9045   5.663   4.150
400    880     5   80     0.8966   0.9295   8.070   5.954
5000   11000   5   1000   0.9081   0.9525   28.73   21.32


B. Network with two pairwise identical cliques

Here we consider a network with two pairwise identical cliques that consists of a pair of cliques $C_1$ and $C_2$ with
$q$ vertices each and a pair of cliques $C_3$ and $C_4$ with $p$ ($< q$) vertices each. These four cliques are connected by single edges, as described in Fig. 3. This network has $n = 2(p + q)$ vertices and $m = p(p - 1) + q(q - 1) + 4$ edges.

FIG. 3: (Color online) Network with two pairwise identical cliques. $K_p$ and $K_q$ represent cliques with $p$ and $q$ vertices, respectively.

Consider two divisions $\mathcal{C}_A = \{C_1, C_2, C_3, C_4\}$ and $\mathcal{C}_B = \{C_1, C_2, C_3 \cup C_4\}$. Note that division $\mathcal{C}_A$ is the more natural community structure that we would like to identify. Unfortunately, maximizing Z-modularity may choose $\mathcal{C}_B$, i.e., $Z(\mathcal{C}_A) < Z(\mathcal{C}_B)$ holds for some pair of $p$ and $q$. However, if modularity maximization adopts $\mathcal{C}_A$, then so does Z-modularity, i.e., for any pair of $p$ and $q$, if $Q(\mathcal{C}_A) > Q(\mathcal{C}_B)$ holds, then $Z(\mathcal{C}_A) > Z(\mathcal{C}_B)$ also holds. This fact follows from the definitions of Z-modularity and the original modularity.

Table II lists the values of modularity and Z-modularity of divisions $\mathcal{C}_A$ and $\mathcal{C}_B$ for some networks with two pairwise identical cliques. We can confirm that both modularity and Z-modularity tend to merge $C_3$ and $C_4$ as the sizes of $C_1$ and $C_2$ become large. However, there is a case where only Z-modularity separates $C_3$ and $C_4$. Therefore, we see that Z-modularity again mitigates the resolution limit of modularity in this case.

TABLE II: Numerical examples of modularity and Z-modularity for some networks with two pairwise identical cliques.

n     m      p   q    Q(C_A)   Q(C_B)   Z(C_A)   Z(C_B)
26    80     5   8    0.6618   0.3385   1.443    1.345
42    264    5   16   0.5650   0.5653   1.144    1.143
74    1016   5   32   0.5182   0.5190   1.037    1.039
138   4056   5   64   0.5047   0.5049   1.009    1.010


IV. EXPERIMENTAL RESULTS

The purpose of our computational experiments is to evaluate the validity and reliability of the quality function Z-modularity. To this end, throughout the experiments, we maximize Z-modularity using a simulated annealing algorithm. Note that our algorithm is obtained immediately by changing the objective function from modularity to Z-modularity in the algorithm proposed by Guimerà and Amaral [10]. The implementation of their algorithm can be found on Lancichinetti's web page [35], and we use it with default parameters with the exception of the above change of objective function. Our experiments are conducted on various artificial networks and on well-known real-world networks.
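To give a feel for maximizing Z-modularity by simulated annealing, here is a deliberately simplified sketch. It is not the Guimerà and Amaral implementation used in the experiments: it proposes only single-vertex moves, recomputes the objective from scratch at every step, and uses a geometric cooling schedule. The function names and all parameter values (`steps`, `t_start`, `cooling`, `seed`) are illustrative assumptions, and the network is assumed to be simple with vertices numbered 0 to n - 1 and at least one edge.

```python
import math
import random


def z_value(edges, labels):
    """Z-modularity of the division given by labels; -inf when undefined."""
    m = len(edges)
    intra = 0
    degree_sum = {}
    for u, v in edges:
        degree_sum[labels[u]] = degree_sum.get(labels[u], 0) + 1
        degree_sum[labels[v]] = degree_sum.get(labels[v], 0) + 1
        intra += labels[u] == labels[v]
    p = sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    if p >= 1.0:  # a single community holds every edge
        return float("-inf")
    return (intra / m - p) / math.sqrt(p * (1.0 - p))


def anneal(edges, n, steps=5000, t_start=0.1, cooling=0.999, seed=0):
    """Return the best Z value found and the corresponding labels."""
    rng = random.Random(seed)
    labels = list(range(n))  # start from singleton communities
    current = z_value(edges, labels)
    best, best_labels = current, labels[:]
    temperature = t_start
    for _ in range(steps):
        v = rng.randrange(n)
        old = labels[v]
        labels[v] = rng.randrange(n)  # propose moving v to a random label
        candidate = z_value(edges, labels)
        delta = candidate - current
        # Metropolis rule: accept improvements, sometimes accept worse moves.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
            if current > best:
                best, best_labels = current, labels[:]
        else:
            labels[v] = old  # reject and undo the move
        temperature *= cooling
    return best, best_labels
```

A call such as `anneal(edges, n)` on a small edge list returns the best division seen during the run. Recomputing the objective at each step costs time proportional to the number of edges, so this sketch is only suitable for small networks.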


A. Artificial networks 


First, we report the results of computational experiments with artificial networks. We compare divisions obtained by maximizing Z-modularity with divisions obtained by modularity maximization on a wide variety of networks. The modularity is also maximized by the simulated annealing algorithm proposed by Guimerà and Amaral [10]. We deal with three types of artificial networks: the planted l-partition model, the Lancichinetti-Fortunato-Radicchi (LFR) benchmark, and the Hanoi graph. For the planted l-partition model and the LFR benchmark, once their parameters are set, the ground-truth community structure is fixed. Thus, we can evaluate the quality of the obtained community structure by comparison with the ground-truth using some measure.

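To this end, we adopt the normalized mutual information introduced by Danon et al. [36]. The normalized mutual information for two divisions $\mathcal{C}_1$ and $\mathcal{C}_2$ of $n$ vertices is defined as follows:

$$I_{\mathrm{norm}}(\mathcal{C}_1, \mathcal{C}_2) = \frac{2 I(\mathcal{C}_1, \mathcal{C}_2)}{H(\mathcal{C}_1) + H(\mathcal{C}_2)},$$

where

$$I(\mathcal{C}_1, \mathcal{C}_2) = \sum_{C_1 \in \mathcal{C}_1} \sum_{C_2 \in \mathcal{C}_2} \frac{|C_1 \cap C_2|}{n} \log_2 \left( \frac{n \cdot |C_1 \cap C_2|}{|C_1| \cdot |C_2|} \right)$$

and

$$H(\mathcal{C}) = -\sum_{C \in \mathcal{C}} \frac{|C|}{n} \log_2 \frac{|C|}{n}.$$

The normalized mutual information ranges from 0 to 1. For two divisions $\mathcal{C}_1$ and $\mathcal{C}_2$, the higher the normalized mutual information is, the more similar they are (and vice versa). In fact, $I_{\mathrm{norm}}(\mathcal{C}_1, \mathcal{C}_2) = 1$ if $\mathcal{C}_1$ and $\mathcal{C}_2$ are identical, and $I_{\mathrm{norm}}(\mathcal{C}_1, \mathcal{C}_2) = 0$ if they are independent. This measure has often been used to evaluate community detection methods. For example, see the computational experiments in Refs. [37, 38].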
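The normalized mutual information can be computed from two label vectors as in the following sketch. It assumes that each division is given as a list holding the community label of every vertex, that pairs of communities with empty intersection contribute zero to the sum, and that two divisions that each consist of a single community are treated as identical; the function name is illustrative.

```python
from collections import Counter
from math import log2


def normalized_mutual_information(labels_a, labels_b):
    """2 I(A, B) / (H(A) + H(B)) for two divisions of the same vertex set."""
    n = len(labels_a)
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)
    overlap = Counter(zip(labels_a, labels_b))  # only non-empty intersections
    mutual = sum((n_ab / n) * log2(n * n_ab / (size_a[a] * size_b[b]))
                 for (a, b), n_ab in overlap.items())
    entropy_a = -sum((c / n) * log2(c / n) for c in size_a.values())
    entropy_b = -sum((c / n) * log2(c / n) for c in size_b.values())
    if entropy_a + entropy_b == 0:
        return 1.0  # both divisions are the single community V
    return 2 * mutual / (entropy_a + entropy_b)
```

For example, the label vectors `[0, 0, 1, 1]` and `[1, 1, 0, 0]` describe the same division and give the value 1, while `[0, 0, 1, 1]` and `[0, 1, 0, 1]` are independent and give 0.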
FIG. 4: (Color online) Results for the planted l-partition model (horizontal axis: probability $p_{\mathrm{out}}$).


Planted l-partition model. The planted l-partition model was introduced by Condon and Karp [39]. In this model, $n$ vertices are divided into $l$ equally sized groups. Two vertices in the same group are connected with probability $p_{\mathrm{in}}$, whereas two vertices in different groups are connected with probability $p_{\mathrm{out}}$ ($< p_{\mathrm{in}}$). Throughout the experiments, we set $p_{\mathrm{in}} = 0.5$. We construct four networks corresponding to combinations of two different network sizes ($n = 1000$ or $5000$) and two different numbers of groups ($l = 20$ or $50$), and hence two different community sizes. The parameter $p_{\mathrm{out}}$ starts with 0.01 and then increases in stages.
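A network from this model can be generated as in the sketch below. It assumes that n is divisible by l and that consecutive blocks of vertices form the groups; the function name and the `seed` parameter are illustrative. The loop visits every vertex pair, so it takes time quadratic in n.

```python
import random
from itertools import combinations


def planted_partition(n, l, p_in, p_out, seed=None):
    """Return (edges, group) for a planted l-partition network."""
    if n % l != 0:
        raise ValueError("n must be divisible by l")
    rng = random.Random(seed)
    size = n // l
    group = [v // size for v in range(n)]  # ground-truth group of each vertex
    edges = []
    for u, v in combinations(range(n), 2):
        probability = p_in if group[u] == group[v] else p_out
        if rng.random() < probability:
            edges.append((u, v))
    return edges, group
```

The returned `group` list is the ground-truth division against which an obtained division can be compared, for instance with the normalized mutual information.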

The results are shown in Fig. 4. As can be seen, Z-modularity outperforms the original modularity in all four cases. In particular, Z-modularity provides much better results than modularity for networks consisting of relatively small communities.

LFR benchmark. In the planted l-partition model, each group in a generated network forms an Erdős-Rényi random graph [40]. Thus, all vertices have approximately the same degree. Moreover, all groups have exactly the same size. These properties are rarely observed in networks in real-world systems. As a more realistic model, the LFR benchmark was proposed by Lancichinetti, Fortunato, and Radicchi [41] for the case of unweighted and undirected networks. The LFR benchmark was then extended to the case of directed and weighted networks with overlapping communities [42]. We now use the original unweighted and undirected case without overlapping communities.

In the model, the degree distribution and the community size distribution follow power laws with exponents $\gamma$ and $\beta$, respectively. Furthermore, we can specify the number of vertices $n$, average degree $\langle k \rangle$, maximum degree $k_{\max}$, minimum community size $c_{\min}$, maximum community size $c_{\max}$, and mixing parameter $\mu$. In particular, the mixing parameter $\mu$ indicates the mixing ratio of communities, i.e., the higher $\mu$ is, the more densely the communities are connected to one another. The model constructs a network consistent with the specified parameters. For more details, see Ref. [41]. In our experiments, we set the parameters the same as used in Refs. [?, 35] as follows: $\gamma = -2$, $\beta = -1$, $\langle k \rangle = 20$, and $k_{\max} = 50$. We construct four networks corresponding to combinations of two different network sizes ($n = 1000$ or $5000$) and two different ranges of community size ($(c_{\min}, c_{\max}) = (10, 50)$ or $(20, 100)$).

FIG. 5: (Color online) Results for the LFR benchmark (horizontal axis: mixing parameter $\mu$).

The results are illustrated in Fig. 5. For the smaller networks ($n = 1000$), the mutual information values obtained by maximizing Z-modularity are lower than those obtained by modularity maximization when $\mu < 0.6$ for both community size settings. This trend is significant when the network consists of relatively large communities ($(c_{\min}, c_{\max}) = (20, 100)$). On the other hand, for larger networks ($n = 5000$), Z-modularity outperforms the original modularity for both community size settings. From the above, we see that Z-modularity is particularly suitable for identifying community structure when a network consists of relatively small communities.

Here we investigate why the mutual information values obtained by maximizing Z-modularity are low when the community sizes are large. To this end, Fig. 6 depicts the adjacency matrices of the LFR benchmark network with parameters $\gamma = -2$, $\beta = -1$, $n = 1000$, $\langle k \rangle = 20$, $k_{\max} = 50$, $c_{\min} = 20$, $c_{\max} = 100$, and $\mu = 0.3$. The vertices are ordered according to both the ground-truth partition and the optimal partition for Z-modularity. The edges connecting vertices in the same community and in different communities are plotted with different colors, i.e., red and blue, respectively. As can be seen, maximizing Z-modularity divides the relatively large ground-truth communities because random fluctuations create much denser communities inside them, which gives rise to a hierarchical structure.

Hanoi graph. Here we demonstrate optimal partitions with respect to Z-modularity and the original modularity for the Hanoi graph, which is an example of networks with hierarchical organization. The Hanoi graph $H_n$ corresponds to the allowed moves in the tower of Hanoi for $n$ disks, which is a famous puzzle invented by Édouard Lucas in 1883. The Hanoi graph $H_n$ has $3^n$ vertices and $3 \cdot (3^n - 1)/2$ edges. In the context of community detection in networks, the Hanoi graph $H_3$ is used by Rosvall and Bergstrom [33].

FIG. 6: (Color online) Adjacency matrices for an LFR benchmark network: (a) ground-truth, (b) Z-modularity.
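The Hanoi graph can be built by enumerating puzzle states and legal moves. In the sketch below, a state is a tuple that records the peg (0, 1, or 2) of each disk, with disk 0 the smallest; this encoding and the function name are illustrative choices.

```python
from itertools import product


def hanoi_graph(n_disks):
    """Return (vertices, edges) of the Hanoi graph for n_disks disks."""

    def top_disk(state, peg):
        # The smallest disk on a peg is the one on top.
        for disk, where in enumerate(state):
            if where == peg:
                return disk
        return None

    vertices = list(product(range(3), repeat=n_disks))
    edges = set()
    for state in vertices:
        for source in range(3):
            disk = top_disk(state, source)
            if disk is None:
                continue  # nothing to move from an empty peg
            for target in range(3):
                if target == source:
                    continue
                blocker = top_disk(state, target)
                # A disk may only be placed on an empty peg or a larger disk.
                if blocker is None or disk < blocker:
                    moved = state[:disk] + (target,) + state[disk + 1:]
                    edges.add((min(state, moved), max(state, moved)))
    return vertices, sorted(edges)
```

Each legal move is reversible, so storing every edge as an ordered pair of states removes the duplicates. The vertex and edge counts can be compared with $3^n$ and $3 \cdot (3^n - 1)/2$.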

The results for Hanoi graph $H_4$ are shown in Fig. 7, where the label (and color) of each vertex represents the community to which the vertex belongs. As can be seen, maximizing Z-modularity leads to a more detailed partition than modularity maximization.


B. Real-world networks 

Here we report the results of computational experiments with real-world networks, i.e., Zachary's karate club network, the Les Misérables network, and the American college football network.

Zachary's karate club network. The first example is the famous karate club network analyzed by Zachary [?], which is often used as a benchmark to evaluate community detection methods. It consists of 34 vertices representing the members of a karate club in an American university and 78 edges representing friendship relations among individuals. Because of a conflict between the club administrator and the instructor, the club members split into two groups, one supporting the administrator and the other supporting the instructor. Therefore, these groups can be viewed as a ground-truth community structure.

The division obtained by maximizing Z-modularity is shown in Fig. 8, where vertices with the same color represent a community. The label of each vertex represents an identification number of the member. For example, 1 and 34 represent the administrator and the instructor, respectively. The dashed line gives the division of the network into the above two groups. Although the community {3, 10, 29} straddles two groups, the other communities are all contained in either one of the groups.



(a) Optimal partition for Z-modularity: 27 communities, Z = 3.376, and Q = 0.6379.

(b) Optimal partition for modularity: 9 communities, Z = 2.510, and Q = 0.7889.

FIG. 7: (Color online) Community structure for Hanoi graph $H_4$.


Les Misérables network. The second example is the network of the characters in the novel Les Misérables by Victor Hugo, compiled by Knuth [45]. It consists of 77 vertices representing the characters and 254 edges indicating the co-appearance of characters.

The division obtained by maximizing Z-modularity is presented in Fig. 9, where vertices with the same color represent a community. The label of each vertex represents the name of the character. Identified communities are likely to correspond to specific groups within the story. For example, the community consisting of 12 vertices (shaded with light brown) at the top left corner contains major characters belonging to the revolutionary student club Friends of the ABC.
FIG. 8: (Color online) Community structure for Zachary's karate club network: 6 communities, Z = 0.9266, and Q = 0.3882.

FIG. 9: (Color online) Community structure for Les Misérables network: 9 communities, Z = 1.490, and Q = 0.5245.


American college football network. The third and final example is a network of college football teams in the United States, which was derived by Girvan and Newman [46]. There are 115 vertices representing the football teams, and 654 edges connecting teams that played each other in a regular season. The teams are divided into 12 groups referred to as conferences containing approximately 10 teams each. More games are played between teams in the same conference than between teams in different conferences. Thus, the conferences can be viewed as a ground-truth community structure.

The division obtained by maximizing Z-modularity is shown in Fig. 10, where vertices with the same color represent a community. Note that the label of each vertex now represents the conference to which the team belongs rather than an identification number of the team. Although some misclassifications are observed, Z-modularity correctly identifies 7 out of 12 conferences (i.e., conferences 0, 1, 2, 3, 7, 8, and 9). This result is notably better than the divisions obtained by modularity maximization. In fact, as reported in Ref. [?], only four conferences were correctly recovered by a division with a higher modularity value Q = 0.6046.

FIG. 10: (Color online) Community structure for American college football network: 14 communities, Z = 2.111, and Q = 0.5738.


V. CONCLUSIONS 

In this study, we have identified a problem in the concept of modularity and suggested a solution to overcome this problem. Specifically, we have obtained a new quality function Z-modularity that measures the Z-score of a given division with respect to the fraction of the number of edges within communities. Theoretical analysis has shown that Z-modularity mitigates the resolution limit of the original modularity in certain cases. In fact, Z-modularity never merges adjacent cliques in the well-known ring of cliques network with any number and size of cliques. In computational experiments, we have evaluated the validity and reliability of Z-modularity. The results for artificial networks show that Z-modularity more accurately detects the ground-truth community structure than the original modularity in most cases. In particular, Z-modularity outperforms modularity for networks consisting of relatively small communities. Furthermore, the results for real-world networks demonstrate that Z-modularity leads to natural and reasonable community structure in practical use. Therefore, we conclude that Z-modularity could be another option for the quality function in community detection.

In the future, further experiments should be conducted to examine the performance of Z-modularity in more detail. Although rigorous experiments were conducted in the present study, other experimental settings are also possible. As another future direction, the physical interpreta-